A Novel Clustering Algorithm For Application To Large Probabilistic Tractography Data Sets

Authors

  • R. E. Smith
  • J-D. Tournier
  • F. Calamante
  • A. Connelly
Abstract

Introduction: The segmentation of brain white matter through clustering of tractography data is a problem that has attracted considerable attention in recent years. Approaches to date have succeeded in identifying the major white matter structures from diffusion tensor tractography, but suffer from a number of limitations, including dependence upon the definition of regions of interest, use of a similarity matrix that restricts the quantity of data that can be processed, a requirement for manual segmentation, or a coarse resolution of clustering due to the use of a feature space. These limitations become critical when dealing with probabilistic tractography, where very large data sets must be produced to accurately represent the underlying biological structure, particularly when applied to the whole brain. We present a novel algorithm capable of automated clustering of very large probabilistic track data sets (demonstrated on 1,000,000 tracks) at any chosen cluster scale.

Methods: Our clustering algorithm consists of three stages. The first efficiently produces clusters from the fiber set using a data stream clustering strategy: each track is compared in sequence to a set of exemplars, and a new exemplar is produced whenever an incoming track is sufficiently unique. The second stage uses a fast sparse implementation of the popular K-means algorithm, allowing clusters within a local neighbourhood to exchange tracks to reduce the global sum of squared error. The optional third stage merges neighbouring clusters for which the inter-cluster boundary is deemed arbitrary, based upon inter- and intra-cluster track distances. The second and third stages improve the reproducibility of the algorithm, as the results of data stream clustering depend upon the order in which the data are presented.
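The first stage described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `stream_cluster` and `mean_point_distance` are hypothetical, the distance used here is a simple placeholder that assumes all tracks have been resampled to the same number of points, and the exemplar search is a plain linear scan rather than anything optimised for a million tracks.

```python
import numpy as np

def stream_cluster(tracks, distance, threshold):
    """Single-pass clustering: assign each track to its nearest exemplar,
    or promote it to a new exemplar if no exemplar is within `threshold`."""
    exemplars = []  # indices of tracks that act as cluster exemplars
    labels = []     # cluster index of each track, in input order
    for i, track in enumerate(tracks):
        best_label, best_dist = -1, np.inf
        for label, ex in enumerate(exemplars):
            d = distance(track, tracks[ex])
            if d < best_dist:
                best_label, best_dist = label, d
        if best_dist > threshold:
            # The incoming track is "sufficiently unique": start a new cluster.
            exemplars.append(i)
            labels.append(len(exemplars) - 1)
        else:
            labels.append(best_label)
    return labels, exemplars

def mean_point_distance(a, b):
    # Placeholder metric (an assumption, not the metric used in the abstract):
    # mean Euclidean distance between corresponding points of two tracks.
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

# Toy data: three straight tracks of 5 points each, offset along y (in mm).
t = np.linspace(0.0, 1.0, 5)[:, None]
base = np.hstack([t * 10.0, np.zeros_like(t), np.zeros_like(t)])
tracks = [base, base + [0.0, 1.0, 0.0], base + [0.0, 30.0, 0.0]]

labels, exemplars = stream_cluster(tracks, mean_point_distance, threshold=10.0)
print(labels, exemplars)
```

In this toy case the first two tracks share a cluster and the third founds its own. Because exemplars are created in arrival order, presenting the same tracks in a different order can yield different exemplars, which is the order dependence that the second and third stages are said to mitigate.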
The number of clusters is not set explicitly; rather, it is a function of the selected scale of clustering and the quality of discrimination between neighbouring clusters at that scale. The Hausdorff similarity metric with an upper threshold was selected to appropriately cluster tracks at a very fine resolution. An additional post-processing stage applies an agglomerative hierarchical clustering approach to the resulting clusters to gradually identify larger structures within the brain. This approach provides logical sub-divisions of the primary white matter structures at a number of scales, down to cluster and even individual track level. A different similarity metric was used for the hierarchical clustering stage, since identifying clusters belonging to the same major pathways is a fundamentally different problem from fine-scale clustering. This is illustrated in Figure 1: bundles which ought to be grouped together according to established human anatomy may not have high similarity according to the simple Hausdorff metric, and vice versa. A novel metric was designed for this purpose, incorporating a number of measures including distance and directional coherence between tracks, as well as global information such as track density and continuity of track directionality between points.

Results: One million probabilistic streamlines were generated from diffusion-weighted (DW) data acquired on a 3T Siemens Trio from a healthy volunteer (2.3 mm isotropic, 150 DWI directions, b = 3000 s/mm²) using the MRtrix software package, with the fiber orientation distributions estimated using Constrained Spherical Deconvolution. The clustering of these tracks (including hierarchical classification) was achieved in approximately 4 hours, running on a single 2.1 GHz Intel processor with 4 GB of RAM. With clustering performed using a Hausdorff threshold of 10 mm, ~16,000 individual clusters (of 5 tracks or more) were identified.
Figure 2 shows coronal, sagittal and transverse projections of a number of major fiber bundles identified through hierarchical classification – note that these bundles can be further sub-divided for visualisation down to the desired scope.
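The Hausdorff metric with an upper threshold mentioned in the Methods can be sketched as follows. This is an illustrative reading, not the authors' code: the abstract does not say how the threshold is applied, so the sketch assumes it acts as a cap that allows the computation to stop early once two tracks are known to be at least that far apart. The names `directed_hausdorff_capped` and `hausdorff_capped` are hypothetical, and tracks are assumed to be arrays of 3D points in mm.

```python
import numpy as np

def directed_hausdorff_capped(a, b, cap):
    """Largest distance from any point of `a` to its nearest point in `b`,
    clipped at `cap`. Returns early once the cap is reached."""
    worst = 0.0
    for p in a:
        nearest = float(np.min(np.linalg.norm(b - p, axis=1)))
        if nearest > worst:
            worst = nearest
        if worst >= cap:
            return cap  # already too far apart; the exact value is not needed
    return worst

def hausdorff_capped(a, b, cap):
    """Symmetric Hausdorff distance between two tracks, clipped at `cap`."""
    forward = directed_hausdorff_capped(a, b, cap)
    if forward >= cap:
        return cap
    return max(forward, directed_hausdorff_capped(b, a, cap))

# Toy tracks: a straight line along x, and copies shifted along y (in mm).
a = np.column_stack([np.linspace(0.0, 10.0, 11), np.zeros(11), np.zeros(11)])
near = a + [0.0, 3.0, 0.0]
far = a + [0.0, 25.0, 0.0]

print(hausdorff_capped(a, near, cap=10.0))  # within the cap: exact distance
print(hausdorff_capped(a, far, cap=10.0))   # beyond the cap: clipped value
```

With a cap equal to the clustering threshold (10 mm in the Results), any pair that reaches the cap cannot belong to the same fine-scale cluster, so skipping the remaining point comparisons does not change the outcome of that test.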

Related resources

A novel local search method for microaggregation

In this paper, we propose an effective microaggregation algorithm to produce more useful protected data for publishing. Microaggregation is mapped to a clustering problem with known minimum and maximum group size constraints. In this scheme, the goal is to cluster n records into groups of at least k and at most 2k-1 records, such that the sum of the within-group squ...

DT-MRI Tractography and its Application in Cognitive Neuroscience

Recent advancement of MRI techniques and development of new methods of image analysis have allowed us to study large neural tracts within the human brain. This is based on the principle of diffusion tensor MRI that is similar to that of diffusion-weighted imaging but takes magnitude and direction of the diffusion of water into account. Using this technique we have been able to define large neur...

An Incremental DC Algorithm for the Minimum Sum-of-Squares Clustering

Here, an algorithm is presented for solving minimum sum-of-squares clustering problems using their difference of convex (DC) representations. The proposed algorithm is based on an incremental approach and applies the well-known DC algorithm at each iteration. The proposed algorithm is tested and compared with other clustering algorithms using large real-world data sets.

Solving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization

In this paper, a new method is proposed for solving the data clustering problem using the Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. Data clustering is an important topic in the field of data mining, and it has long attracted the attention of researchers and experts in data mining for its numerous applications in solving real-world problems. The CSO algorithm is one ...

Publication date: 2009